Abstract. The natural pseudo-distance of spaces endowed with filtering
functions is valuable for shape classification and retrieval; its optimal
estimate coming from persistence diagrams is the bottleneck distance, which
unfortunately suffers from combinatorial explosion. A possible algebraic
representation of persistence diagrams is offered by complex polynomials;
since far polynomials represent far persistence diagrams, a fast comparison
of the coefficient vectors can reduce the size of the database to be
classified by the bottleneck distance. This article explores experimentally
three transformations from diagrams to polynomials and three distances
between the complex vectors of coefficients.

1 Introduction

Persistent homology has already proven to be an effective tool for shape
representation in various applications, in particular when the objects to be
classified, compared or retrieved have a natural origin. The interplay of
geometry and topology in persistence makes it possible to capture qualitative
aspects in a formal and computable way, yet it does not suffer from the
excessive freedom of mere topological equivalence. The privileged tool for
shape comparison is the natural pseudo-distance, which is scarcely computable.
Luckily, persistence diagrams condense the essence of the shape concept of the
observer in finite sets of points in the plane; moreover, the bottleneck
distance (a.k.a. matching distance) between persistence diagrams yields an
optimal lower bound to the natural pseudo-distance. There is a problem: the
bottleneck distance suffers from combinatorial explosion [8], so it becomes
hard to scan a large database when it comes to retrieval. Approximations,
smart organization of the database according to the metric, and progressive
application of different classifiers all help, but the problem is lightened,
not solved.

A paradigm shift came from an idea of Claudia Landi: represent the
persistence diagram as the set of complex roots of a polynomial; then
comparison can be performed on coefficients. Two problems arise. The first,
which comes from the very nature of persistence diagrams, is that in real
situations there are many points near the "diagonal"
$\{(u,v) \in \mathbb{R}^2 : u = v\}$; these points are due to noise, hence less
meaningful for shape representation, yet they have a heavy impact on
polynomial coefficients. The second, which comes from polynomial theory, is
that a small distance between polynomial roots implies a small distance
between coefficients, but the converse is false.

A completely different polynomial representation of barcodes (equivalently:
of persistence diagrams) is the one through tropical algebra, closely adapting
to the bottleneck distance.

The contribution of the paper. We face the first problem (the existence of
points near the diagonal) by performing a plane warping which takes the whole
line $u = v$ to 0, so points near the diagonal actually become close together.
Making noise points close to each other and around zero diminishes their
contribution to polynomial coefficients, above all to the first (and most
relevant) ones: sum of roots, sum of pairwise products of roots, etc. As for
the second problem (the fact that close coefficients may not mean close
roots), we explore the use of polynomial comparison as a preprocessing phase
in shape retrieval, i.e. as a very fast way of getting rid of definitely far
objects, so that the bottleneck distance needs to be computed only for a small
set of candidates, along the same lines as [5]. The results are satisfactory:
in some of our experiments the bottleneck distance even turns out to be
unnecessary.

2 Preliminaries 

In persistence, the shape of an object is usually studied by choosing a
topological space $X$ to represent it, and a function $f : X \to \mathbb{R}$,
called a filtering (or measuring) function, to define a family of subspaces
$X_u = f^{-1}((-\infty, u])$, $u \in \mathbb{R}$, nested by inclusion, i.e. a
filtration of $X$. The name "persistence" is bound to the idea of ranking
topological features by importance, according to the length of their "life"
through the filtration. The basic assumption is that the longer a feature
survives, the more meaningful or coarse the feature is for shape description.
In particular, structural properties of the space $X$ are identified by
features that, once born, never die; vice versa, noise and shape details are
characterized by a short life. To study how topological features vary in
passing from a set of the filtration into a larger one we use homology. A nice
feature of this approach is modularity: the choice of different filtering
functions may account for different viewpoints on the same problem (different
shape concepts) or for different tasks. For further details we refer to the
literature on persistent homology.

Persistent homology groups of the pair $(X, f)$, i.e. of the filtration
$\{X_u\}_{u \in \mathbb{R}}$, are defined as follows. Given
$u < v \in \mathbb{R}$, we consider the inclusion of $X_u$ into $X_v$. This
inclusion induces a homomorphism of homology groups $H_k(X_u) \to H_k(X_v)$
for every $k \in \mathbb{Z}$. Its image consists of the $k$-homology classes
that are born before or at the level $u$ and are still alive at the level $v$,
and is called the $k$th persistent homology group of $(X, f)$ at $(u, v)$.
When this group is finitely generated, we denote by $\beta_k^{u,v}(X, f)$ its
rank.

The usual, compact description of persistent homology groups of $(X, f)$ is
provided by the so-called persistence diagrams, i.e. multisets of points whose
abscissa and ordinate are, respectively, the level at which $k$-homology
classes are created and the level at which they are annihilated through the
filtration. If a homology class does not die along the filtration, the
ordinate of the corresponding point is set to $+\infty$.

At the moment, our approach to converting persistence diagrams into complex
vectors can be applied only when neglecting these points at infinity. Hence,
we focus on the subsets of proper points of the classical persistence
diagrams, known in the literature as ordinary persistence diagrams [6]. For
simplicity we still call them "persistence diagrams". We underline that this
choice is not so restrictive, since the number of points at infinity depends
only on the homology of the space $X$, and persistent homology provides a
finite distance between two pairs if and only if the considered spaces are
homeomorphic.

We use the following notation: $\Delta^+ = \{(u, v) \in \mathbb{R}^2 : u < v\}$,
$\Delta = \{(u, v) \in \mathbb{R}^2 : u = v\}$, and
$\bar{\Delta}^+ = \Delta^+ \cup \Delta$.


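Definition 1. Let $k \in \mathbb{Z}$ and $(u, v) \in \Delta^+$. The
multiplicity $\mu_k(u, v)$ of $(u, v)$ is the finite non-negative number
defined by

$$\mu_k(u,v) = \lim_{\varepsilon \to 0^+} \left( \beta_k^{u+\varepsilon,\,v-\varepsilon}(X,f) - \beta_k^{u-\varepsilon,\,v-\varepsilon}(X,f) - \beta_k^{u+\varepsilon,\,v+\varepsilon}(X,f) + \beta_k^{u-\varepsilon,\,v+\varepsilon}(X,f) \right).$$
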
Definition 2. The $k$th persistence diagram $D_k(X, f)$ is the set of all
points $(u, v) \in \Delta^+$ such that $\mu_k(u, v) > 0$, counted with their
multiplicity, union the points of $\Delta$, counted with infinite
multiplicity. We call proper points the points of a persistence diagram lying
on $\Delta^+$.



Fig. 1. Left: The height function $f$ on the space $X$. Right: The associated
0th persistence diagram $D_0(X, f)$.


Figure 1 displays an example of a persistence diagram for $k = 0$. The surface
$X \subset \mathbb{R}^3$ is filtered by the height function $f$. $D_0(X, f)$
has three proper points $p_1, p_2, p_3$, since the abscissa of these points
corresponds to the level at which new connected components are born along the
filtration, while the ordinate identifies the level at which these connected
components merge with existing ones. In terms of multiplicity, this means that
$\mu_0(p_i) > 0$, $i = 1, 2, 3$, and $\mu_0(p) = 0$ for every other point
$p \in \Delta^+$. To see, for example, that $\mu_0(p_1) = 2$, where
$p_1 = (a, b)$, it is sufficient to observe that, for every $\varepsilon > 0$
sufficiently small, it holds that
$\beta_0^{a+\varepsilon,\,b-\varepsilon}(X,f) = 4$,
$\beta_0^{a+\varepsilon,\,b+\varepsilon}(X,f) = 2$,
$\beta_0^{a-\varepsilon,\,b-\varepsilon}(X,f) = \beta_0^{a-\varepsilon,\,b+\varepsilon}(X,f) = 1$,
and apply Definition 1. In an analogous way, it can be observed that
$\mu_0(p_2) = \mu_0(p_3) = 1$.

Persistence diagrams are usually compared through the so-called bottleneck
distance because of the robustness of these descriptors with respect to it.
Roughly, a small change in a given filtering function (w.r.t. the max-norm)
produces just a small change in the associated persistence diagram w.r.t.
the bottleneck distance [70. The bottleneck distance between two persistence 
diagrams measures the cost of finding a correspondence between their points. In 
doing this, the cost of taking a point p to a point p' is measured as the minimum 
between the cost of moving one point onto the other and the cost of moving both 
points onto the diagonal. In particular, the matching of a proper point p with a 
point of A can be interpreted as the destruction of the point p. Formally: 


Definition 3. Let $D$, $D'$ be two persistence diagrams. The bottleneck
distance $d_B(D, D')$ is defined as

$$d_B(D, D') = \min_{\sigma} \max_{p \in D} d(p, \sigma(p)),$$

where $\sigma$ varies among all the bijections between $D$ and $D'$ and

$$d\bigl((u,v),(u',v')\bigr) = \min\left\{ \max\{|u-u'|,\,|v-v'|\},\ \max\left\{\frac{v-u}{2},\,\frac{v'-u'}{2}\right\} \right\}$$

for every $(u, v), (u', v') \in \bar{\Delta}^+$.
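
As an illustration of why this distance is costly, the following minimal sketch evaluates it by brute force over all bijections. It is not the algorithm used in the experiments: the function names are hypothetical, each list of proper points is padded with diagonal points (here all placed at (0, 0), which is harmless because matching a proper point to any diagonal point costs half its lifespan), and the factorial enumeration is only feasible for a handful of points.

```python
from itertools import permutations

def point_cost(p, q):
    """Cost d(p, q) between two points (u, v) with u <= v."""
    (u, v), (u2, v2) = p, q
    move = max(abs(u - u2), abs(v - v2))      # move one point onto the other
    kill = max((v - u) / 2, (v2 - u2) / 2)    # move both points onto the diagonal
    return min(move, kill)

def bottleneck_brute_force(A, B):
    """Bottleneck distance between two short lists of proper points."""
    # Pad with diagonal points so that every proper point may be "destroyed".
    A_pad = list(A) + [(0.0, 0.0)] * len(B)
    B_pad = list(B) + [(0.0, 0.0)] * len(A)
    best = float("inf")
    for perm in permutations(range(len(B_pad))):   # all bijections
        worst = max((point_cost(A_pad[i], B_pad[j])
                     for i, j in enumerate(perm)), default=0.0)
        best = min(best, worst)
    return best

# One long-lived point against a slightly moved copy plus a noise point.
print(bottleneck_brute_force([(0, 4)], [(0, 5), (2, 2.2)]))
```

Already for two diagrams with ten proper points each, the loop above would visit 20! permutations, which is the combinatorial explosion mentioned in the Introduction.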

3 Persistence diagrams vs complex vectors 

In the experimental framework, evaluating the bottleneck distance can be
computationally expensive, which makes it impractical on large datasets. In
this work we therefore propose a new procedure based on the preliminary idea
of representing persistence diagrams by complex polynomials. We translate the
problem of directly comparing two persistence diagrams through the bottleneck
distance into the problem of comparing complex vectors associated with each
persistence diagram through appropriate metrics between vectors. The
components of these complex vectors are the coefficients of complex
polynomials obtained as follows. Firstly, we define a certain transformation
taking points of persistence diagrams to the set of complex numbers. Secondly,
we construct a complex polynomial having the obtained complex numbers as
roots.

In this paper, we consider the three transformations below: 

- $R : \bar{\Delta}^+ \to \mathbb{C}$, with $R(u, v) = u + iv$,

- $S : \bar{\Delta}^+ \to \mathbb{C}$, with

  $$S(u,v) = \begin{cases} \dfrac{v-u}{\alpha\sqrt{2}}\,(u + iv), & \text{if } (u,v) \neq (0,0) \\ (0,0), & \text{otherwise,} \end{cases}$$

- $T : \bar{\Delta}^+ \to \mathbb{C}$, with

  $$T(u,v) = \frac{v-u}{2}\,\bigl(\cos\alpha - \sin\alpha + i(\cos\alpha + \sin\alpha)\bigr),$$

where $\alpha = \sqrt{u^2 + v^2}$.

$R$, $S$, $T$ are continuous maps; $R$ and $S$ are also injective on
$\bar{\Delta}^+$ and $\Delta^+$, respectively. We define the multiplicity of a
complex number in the range of $R$, $S$, $T$ to be the sum of the
multiplicities of the points belonging to its preimage (this is necessary
because of the non-injectivity of $T$ on $\Delta^+$, although a preimage
containing more than one proper point of the diagram occurs with probability
zero). The main differences among these deformations are the following: the
deformation $R$ acts as the identity, just passing from $\mathbb{R}^2$ to
$\mathbb{C}$; the deformation $S$ warps the diagonal $\Delta$ to the origin,
and takes points of $\Delta^+$ to points of
$\{z \in \mathbb{C} : \mathrm{Re}(z) < \mathrm{Im}(z)\}$; the deformation $T$
warps the diagonal $\Delta$ to the origin, and takes points of $\Delta^+$ to
points of $\mathbb{C}$. An example showing how $S$ and $T$ transform a
persistence diagram is represented in Figure 2.

Fig. 2. A persistence diagram with its transformations S (left) and T (right).
Same colors identify same lengths.

Both $S$ and $T$ seemed preferable to $R$ because points near $\Delta$, which
are due to noise, have to be considered close to each other in the bottleneck
distance, although they may be very far apart in Euclidean distance. Taking
them all near the origin also reduces their impact on the sums and sums of
products which build the polynomial coefficients we are going to compare. In
particular, $T$ was designed to distribute the image of those noise points
around zero, whereas $S$ places them near zero, but all on one side: in the
half-plane of $\mathbb{C}$ corresponding to $\bar{\Delta}^+$. $T$ has two
drawbacks: it is not injective on $\Delta^+$ and does not behave well with
respect to simple transformations.
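
The three maps can be sketched in a few lines of Python. This is an illustrative sketch rather than the code used for the experiments; it assumes that $S$ scales $u + iv$ by $(v-u)/(\alpha\sqrt{2})$ and that $T$ scales $\cos\alpha - \sin\alpha + i(\cos\alpha + \sin\alpha)$ by $(v-u)/2$, with $\alpha = \sqrt{u^2+v^2}$. The helper `complex_list` is a hypothetical name for the zero-padding step of Algorithm 1.

```python
import math

def R(u, v):
    """Identity: the point (u, v) read as the complex number u + iv."""
    return complex(u, v)

def S(u, v):
    """Warp the diagonal to 0, keeping the direction of u + iv."""
    alpha = math.hypot(u, v)
    if alpha == 0.0:
        return 0j
    return (v - u) / (alpha * math.sqrt(2)) * complex(u, v)

def T(u, v):
    """Warp the diagonal to 0 and spread the images around the origin."""
    alpha = math.hypot(u, v)
    return (v - u) / 2 * complex(math.cos(alpha) - math.sin(alpha),
                                 math.cos(alpha) + math.sin(alpha))

def complex_list(points, F, M):
    """Apply F to every proper point, then pad with zeros up to length M."""
    B = [F(u, v) for (u, v) in points]
    return B + [0j] * (M - len(B))

# A noise point near the diagonal lands near 0 under S and T, not under R.
for F in (R, S, T):
    print(F.__name__, abs(F(5.0, 5.1)), abs(F(1.0, 3.0)))
```

Under these assumptions both $S$ and $T$ send a point to a complex number whose modulus is $(v-u)/\sqrt{2}$, i.e. its Euclidean distance from the diagonal; they differ only in the argument of the image.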

Let $D$ be a persistence diagram, and $p_1 = (u_1, v_1), \dots, p_s = (u_s, v_s)$
its proper points with multiplicity $r_1, \dots, r_s$, respectively. Let now
the complex numbers $z_1, \dots, z_s$ be obtained from $p_1, \dots, p_s$ by
one of the transformations $R$, $S$ or $T$. We associate to $D$ the complex
polynomial $f_D(t) = \prod_{j=1}^{s} (t - z_j)^{r_j}$. We are actually
interested in the coefficient sequence of $f_D(t)$, which we can compute by
Viète's formulas (see Algorithm 2).

Once we have the polynomials
$f_D(t) = t^n - a_1 t^{n-1} + \dots + (-1)^i a_i t^{n-i} + \dots + (-1)^n a_n$
and
$f_{D'}(t) = t^m - a'_1 t^{m-1} + \dots + (-1)^j a'_j t^{m-j} + \dots + (-1)^m a'_m$
corresponding to persistence diagrams $D$, $D'$, we face a first problem,
given by the possibly different degrees $n$ and $m$ ($m < n$, say). Because of
their expression in terms of roots, we prefer to compare coefficients with the
same index, rather than coefficients relative to the same degree of $t$. We
manage this problem by adding $n - m$ null coefficients to $f_{D'}(t)$, i.e.
multiplying $f_{D'}(t)$ by $t^{n-m}$, which amounts to adding the complex
number zero with multiplicity $n - m$. In so doing, we can build two vectors
of complex numbers $(a_1, \dots, a_n)$, $(a'_1, \dots, a'_n)$ of the same
length and are ready to compute a distance between them. By continuity of
Viète's formulas, close roots imply close coefficients. Hence, two persistence
diagrams that are close in terms of bottleneck distance have close associated
polynomials. Unfortunately, the converse is not true.
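
The coefficients $a_1, \dots, a_k$ are the first $k$ elementary symmetric polynomials of the roots. A minimal sketch of one way to compute them is given below; it uses the standard one-root-at-a-time recurrence, which is not necessarily the induction used in our implementation, and the function name is hypothetical.

```python
def first_coefficients(roots, k):
    """Return [a_1, ..., a_k], the first k elementary symmetric
    polynomials of the given complex roots (Viete's formulas)."""
    e = [1 + 0j] + [0j] * k            # e[0] = 1, e[j] = 0 before any root
    for z in roots:
        # Update from high to low index so each root is used once per term.
        for j in range(k, 0, -1):
            e[j] += z * e[j - 1]
    return e[1:]

# (t - 1)(t - 2)(t - 3) = t^3 - 6 t^2 + 11 t - 6
print(first_coefficients([1, 2, 3], 3))
# Padding with zero roots leaves the coefficients unchanged.
print(first_coefficients([1, 2, 3, 0, 0], 3))
```

The second call illustrates why padding the shorter root list with zeros is harmless: a zero root contributes nothing to any sum of products.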

Preliminary tests suggested that the first coefficients were more meaningful;
therefore we experimented with different distances on two complex vectors
$(a_1, \dots, a_k)$, $(a'_1, \dots, a'_k)$, $k \in \{1, \dots, n\}$: one
treating all coefficients equally, and two which give decreasing weight to
coefficients of increasing index. The chosen metrics are the following:

$$d_1 = \sum_{j=1}^{k} |a_j - a'_j|, \qquad d_2 = \sum_{j=1}^{k} \frac{|a_j - a'_j|}{j}, \qquad d_3 = \sum_{j=1}^{k} |a_j - a'_j|^{1/j}.$$

Algorithms and computational analysis. The algorithms below summarize the
principal steps of our scheme. $F$ in Algorithm 1 (line 2) and $d$ in
Algorithm 3 (line 4) correspond, respectively, to one of the transformations
$R$, $S$, $T$ and one of the metrics $d_1$, $d_2$, $d_3$ previously defined.

Algorithm 1: ComplexLists

Input: List A of proper points of a persistence diagram D,
       M = max{|A| : A ∈ database Db}
Output: List B of complex numbers associated with D

1: for each (u, v) ∈ A
2:     replace (u, v) by F(u, v)
3: end for
4: if |B| < M
5:     append M − |B| zeros to B
6: end if

Algorithm 2: ComplexVectors

Input: M, B = list(z_1, ..., z_M) associated with D, k ∈ [0, M)
Output: Complex vector V_D associated with D

1: set V_D = list()
2: for j ∈ {1, ..., k}
3:     compute c_j(z_1, ..., z_M) = Σ_{1 ≤ i_1 < i_2 < ... < i_j ≤ M} z_{i_1} · z_{i_2} · ... · z_{i_j}
4:     append c_j to V_D
5: end for

Algorithm 3: VectorsComparison

Input: L = {V_D : V_D complex vector associated with D, for each D ∈ Db}
Output: Matrix of distances d(V_D, V_D')

1: set M = (0_ij), i, j = 1, ..., |L|
2: for each i ∈ {1, ..., |L|}
3:     for each j ∈ {i, ..., |L|}
4:         replace 0_ij, 0_ji by d(i, j)
5:     end for
6: end for
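
The symmetric filling of Algorithm 3 can be sketched as follows. The function name and the toy distance are hypothetical; any of the metrics $d_1$, $d_2$, $d_3$ can be passed as `dist`.

```python
def distance_matrix(vectors, dist):
    """Fill a symmetric matrix with dist(vectors[i], vectors[j])."""
    n = len(vectors)
    M = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):          # the diagonal stays at 0
            M[i][j] = M[j][i] = dist(vectors[i], vectors[j])
    return M

# Toy distance: sum of the moduli of the coefficient differences.
toy = lambda a, b: sum(abs(x - y) for x, y in zip(a, b))
print(distance_matrix([[0j], [3 + 4j], [0j]], toy))
```

Only the upper triangle is computed, so the number of distance evaluations is N(N − 1)/2 for a database of N vectors.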


Let $N = |L| = |Db|$. It is easily seen that the computational complexities of
Algorithms 1 and 3 are $C_1 = O(M \cdot N)$ and $C_3 = O(k \cdot N^2)$,
respectively. The cost of Algorithm 2 depends on how the computation of
Viète's formulas is implemented. Using induction on the index $j$, we have
$C_2 = O((2k^2 + k \cdot M) \cdot N)$.

We want to show that, with a suitable choice of the number $k$ of computed
coefficients, our approach to database classification is in general cheaper
than using the bottleneck distance. Our comparison is carried out in terms of
storage locations and not in terms of running time, since the algorithm
proposed here and the one based on the bottleneck distance run on different
platforms. We recall that the cost of computing the bottleneck distance
$d_B(D, D')$ is $C_B = O((r + r')^{3/2} \log(r + r'))$ if $A$, $A'$ are the
subsets of proper points of two persistence diagrams $D$, $D'$ with $|A| = r$,
$|A'| = r'$ (see [11]). Instead, using our scheme with $N = 2$ and
$M = \max(r, r')$, we get
$O(\max(r, r') + 2k^2 + k \cdot \max(r, r') + 2k)$, so
$C = O(k \cdot (\max(r, r') + k))$. Since $k < \max(r, r')$, in the worst case
we have $C = O((\max(r, r'))^2)$, which is higher than $C_B$, but for
pre-processing we may choose a favorable $k$ (e.g.
$k = \lfloor \sqrt{\max(r, r')} \rfloor$). Also consider that, for a retrieval
task, the heavy part of the computation (Viète's formulas) for the database is
performed offline; in other words, if we store the coefficient vectors instead
of the lists of proper points, then the search can be performed by a distance
computation of complexity $O(\max(r, r'))$.
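
To get a feeling for these orders of magnitude, the sketch below evaluates the two growth rates for given diagram sizes. It is only indicative: hidden constants are ignored, the natural logarithm is an arbitrary choice, and the function names are hypothetical.

```python
import math

def bottleneck_growth(r, r2):
    """(r + r')^(3/2) * log(r + r'), the growth rate of C_B."""
    n = r + r2
    return n ** 1.5 * math.log(n)

def vector_growth(r, r2, k):
    """k * (max(r, r') + k), the growth rate of C."""
    m = max(r, r2)
    return k * (m + k)

r = r2 = 100
k = math.isqrt(max(r, r2))      # k = floor(sqrt(max(r, r')))
print(vector_growth(r, r2, k), bottleneck_growth(r, r2))
```

With 100 proper points per diagram and $k = 10$, the first quantity is 1100, while the second is larger by roughly one order of magnitude.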


4 Experimental results 

This section is devoted to validating the theoretical framework introduced in
Section 3. In particular, through some experiments on persistence diagrams for
0th homology degree associated with 3D models represented by triangle meshes,
we will show that our approach allows persistence diagrams to be compared
without greatly affecting (and in some cases improving) the quality of results
in terms of database classification.

To test the proposed framework we considered a database of 228 3D-surface
mesh models introduced in [2]. The database is divided into 12 classes, each
containing 19 elements obtained as follows: a null model taken from the Non
Rigid World Benchmark [3] is considered together with six non-rigid
transformations applied to it at three different strength levels. An example
of the transformations at their greatest strength levels is given in Figure 3.



Fig. 4. The null model "seahorse0" depicted with its center of mass $B$ and the
associated vector $w$, which define the filtering functions $f_L$ and $f_P$.


Two filtering functions $f_L$, $f_P$ have been defined on the models of the
database as follows: for each triangle mesh $M$ of vertices
$\{v_1, \dots, v_n\}$, the center of mass $B$ is computed, and the model is
normalized to be contained in a unit sphere.
Further, a vector w is defined as 

E”=i Ik ^ b \\- ■ 

The function $f_L$ is the distance from the line $L$ parallel to $w$ and
passing through $B$, while the function $f_P$ is the distance from the plane
$P$ orthogonal to $w$ and passing through $B$ (see, as an example, Figure 4).
The values of $f_L$ and $f_P$ are then normalized so that they range in the
interval $[0, 1]$. These filtering functions are translation and rotation
invariant, as well as scale invariant because of the a priori normalization of
the models. Moreover, the considered models are sufficiently generic (no
point-symmetries occur, etc.) to ensure that the vector $w$ is well-defined
over the whole database, as well as its orientation stability.

Our experimental results are summarized in Tables 1 and 2 in terms of
precision/recall (PR) graphs when the filtering functions $f_L$, $f_P$,
respectively, are considered. Before going into details, we want to emphasize
that our intent is not to validate the usage of persistence for shape
comparison, retrieval or classification. In fact, as a reader coming from the
retrieval domain will probably note, the PR graphs reported in this paper are
below the state of the art. This depends on the fact that, in general, good
retrieval performance can be achieved only by taking into account different
filtering functions that give rise to a battery of descriptors associated with
each model in the database.


Table 1. PR graphs related to the filtering function $f_L$, when 0th
persistence diagrams are compared directly through the bottleneck distance and
in terms of the first $k$ components of the complex vectors obtained from the
transformations R (first row), S (second row) and T (third row) through the
distances $d_1$ (first column), $d_2$ (second column) and $d_3$ (third
column).

Table 2. PR graphs related to the filtering function $f_P$, when 0th
persistence diagrams are compared directly through the bottleneck distance and
in terms of the first $k$ components of the complex vectors obtained from the
transformations R (first row), S (second row) and T (third row) through the
distances $d_1$ (first column), $d_2$ (second column) and $d_3$ (third
column).



What these plots aim to show is the comparison of the performances when the
database classification is carried out through the computation of the
bottleneck distance $d_B$ between persistence diagrams or the computation of
the distances $d_1$, $d_2$ and $d_3$ between the first $k$ components of the
complex vectors obtained through the transformations $R$, $S$ and $T$, for
different values of $k$ (see Section 3 for the definitions of $R$, $S$, $T$,
$d_1$, $d_2$ and $d_3$). As can easily be observed, when $k$ increases from
the smallest to the biggest number of proper points in the persistence
diagrams of our database, the PR graphs do not change appreciably. This means
that the most important information of the persistence diagram is contained in
the first few vector components, the ones corresponding to the coefficients of
the monomials of highest degree. Moreover, the PR graphs related to vectors
induced by transformations warping the diagonal $\Delta$ to a point (second
and third rows in Tables 1 and 2) provide better results than those induced by
the transformation acting as the identity (first row). This fact depends on
the properties of polynomial coefficients: indeed, roots corresponding to
points of persistence diagrams farther from the diagonal weigh more than those
closer to it. Hence, applying transformations $S$ and $T$ corresponds, in some
sense, to providing points of a persistence diagram with a weight that follows
the paradigm of persistence: the longer the lifespan of a homological class,
the higher the weight associated with the point having as coordinates the
birth and death dates of this class. This outcome is further strengthened by
the usage of the distance $d_3$ (third column in Tables 1 and 2), since it
greatly enhances the contribution of the first vector components to the
dissimilarity measure at the expense of the last ones.

Finally, note that the precision values at high recall (i.e. when retrieving a
large number of objects) are always fairly comparable with the values relative
to the bottleneck distance. This assures us that complex vector comparison can
act as a fast and reliable preprocessing scheme for reducing the set of
objects to be fed to the generally more precise bottleneck distance.
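
Such a two-stage scheme can be sketched as follows. This is a hypothetical outline, not the pipeline used for the tables above: `cheap_dist` stands for one of the vector distances, `exact_dist` for the bottleneck distance, and `shortlist_size` is a free parameter that we have not tuned here.

```python
def retrieve(query, database, cheap_dist, exact_dist, shortlist_size):
    """Rank database items for a query in two stages.

    query:    pair (coefficient_vector, persistence_diagram)
    database: list of such pairs
    Returns the indices of the shortlisted items, best match first.
    """
    q_vec, q_dgm = query
    # Stage 1: fast filtering on the stored coefficient vectors.
    order = sorted(range(len(database)),
                   key=lambda i: cheap_dist(q_vec, database[i][0]))
    shortlist = order[:shortlist_size]
    # Stage 2: the expensive distance only on the surviving candidates.
    return sorted(shortlist,
                  key=lambda i: exact_dist(q_dgm, database[i][1]))

# Toy data: numbers stand in for both vectors and diagrams.
db = [(0, 5), (10, 0), (1, 1), (2, 3)]
gap = lambda a, b: abs(a - b)
print(retrieve((0, 0), db, gap, gap, shortlist_size=3))
```

In the toy run the item at index 1 would be the best match for the exact distance, but it is discarded by the first stage; this is the price of any filtering step and the reason why the precision at high recall of the cheap distance matters.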